9 research outputs found

    Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification

    Full text link
    Due to the modality gap between visible and infrared images with high visual ambiguity, learning \textbf{diverse} modality-shared semantic concepts for visible-infrared person re-identification (VI-ReID) remains a challenging problem. Body shape is one of the significant modality-shared cues for VI-ReID. To dig more diverse modality-shared cues, we expect that erasing body-shape-related semantic concepts in the learned features can force the ReID model to extract more and other modality-shared features for identification. To this end, we propose shape-erased feature learning paradigm that decorrelates modality-shared features in two orthogonal subspaces. Jointly learning shape-related feature in one subspace and shape-erased features in the orthogonal complement achieves a conditional mutual information maximization between shape-erased feature and identity discarding body shape information, thus enhancing the diversity of the learned representation explicitly. Extensive experiments on SYSU-MM01, RegDB, and HITSZ-VCM datasets demonstrate the effectiveness of our method.Comment: CVPR 202

    One for More: Selecting Generalizable Samples for Generalizable ReID Model

    Full text link
    Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on selected training batch, with no regards to the performance on samples outside the batch. It will inevitably cause the model to over-fit the data in the dominant position (e.g., head data in imbalanced class, easy samples or noisy samples). %We call the sample that updates the model towards generalizing on more data a generalizable sample. The latest resampling methods address the issue by designing specific criterion to select specific samples that trains the model generalize more on certain type of data (e.g., hard samples, tail data), which is not adaptive to the inconsistent real world ReID data distributions. Therefore, instead of simply presuming on what samples are generalizable, this paper proposes a one-for-more training objective that directly takes the generalization ability of selected samples as a loss function and learn a sampler to automatically select generalizable samples. More importantly, our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework which is able to simultaneously train ReID models and the sampler in an end-to-end fashion. The experimental results show that our method can effectively improve the ReID model training and boost the performance of ReID models

    Joint Bilateral-Resolution Identity Modeling for Cross-Resolution Person Re-Identification

    No full text
    Person images captured by public surveillance cameras often have low resolutions (LRs), along with uncontrolled pose variations, background clutter and occlusion. These issues cause the resolution mismatch problem when matched with high-resolution (HR) gallery images (typically available during collection), harming the person re-identification (re-id) performance. While a number of methods have been introduced based on the joint learning of super-resolution and person re-id, they ignore specific discriminant identity information encoded in LR person images, leading to ineffective model performance. In this work, we propose a novel joint bilateral-resolution identity modeling method that concurrently performs HR-specific identity feature learning with super-resolution, LR-specific identity feature learning, and person re-id optimization. We also introduce an adaptive ensemble algorithm for handling different low resolutions. Extensive evaluations validate the advantages of our method over related state-of-the-art re-id and super-resolution methods on cross-resolution re-id benchmarks. An important discovery is that leveraging LR-specific identity information enables a simple cascade of super-resolution and person re-id learning to achieve state-of-the-art performance, without elaborate model design nor bells and whistles, which has not been investigated before
    corecore